Skip to content

Replace dead feature-powerset CI job with curated feature check#1601

Open
Jarema wants to merge 2 commits into
mainfrom
ci/curated-feature-check
Open

Replace dead feature-powerset CI job with curated feature check#1601
Jarema wants to merge 2 commits into
mainfrom
ci/curated-feature-check

Conversation

@Jarema

@Jarema Jarema commented Jun 12, 2026

Copy link
Copy Markdown
Member

Problem

The check_features job in main.yml runs cargo hack check --feature-powerset. With ~18 features the powerset explodes into hundreds of thousands of combinations, and the job has been cancelled at the 6h GitHub Actions cap on every recent push to main — feature-combination checking has never actually passed, while burning 6h of CI per push. Meanwhile async-nats/.cargo-hack-check.sh (written precisely to avoid the explosion) was referenced by no workflow.

This is exactly how feature-gated breakage can ship undetected (e.g. cfg-gated builder code only compiled under specific flag sets).

Changes

  • main.yml: check_features now runs bash .cargo-hack-check.sh with timeout-minutes: 45 instead of the powerset.
  • test.yml: new check_features job — feature combinations are now checked on pull requests, not only after merge.
  • .cargo-hack-check.sh: extended for recent feature additions — server_2_14 (now in defaults), fips, and experimental. Chrono-backend combinations were already covered.

Verification

Full script run locally with RUSTFLAGS="-D warnings": all strategies pass (--each-feature with ring, curated combinations incl. chrono backend, aws-lc-rs/fips, all server version flags, experimental).

🤖 Generated with Claude Code

The check_features job ran cargo hack --feature-powerset, which exceeds
the 6h GitHub Actions cap and has been cancelled on every push to main —
feature-combination checking never actually passed, while burning 6h of
CI per push.

Use the curated .cargo-hack-check.sh instead (written for exactly this),
extend it to cover the recently added server_2_14 default feature plus
fips and experimental, add a 45-minute job timeout, and run the check on
pull requests too so feature-gated breakage is caught before merge.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@synadia-claude-reviewer

synadia-claude-reviewer Bot commented Jun 12, 2026

Copy link
Copy Markdown

Claude finished @Jarema's task in 2m 26s —— View job


PR Review

  • Gather context (PR diff, changed files)
  • Review .github/workflows/main.yml changes
  • Review .github/workflows/test.yml changes
  • Review async-nats/.cargo-hack-check.sh changes
  • Post final review summary

Review

The fix is correct and the approach is sound. Replacing a powerset that mathematically cannot complete (~18 features → billions of combinations, always hitting the 6h cap) with a curated set is the right call. No bugs, security issues, or correctness problems found.

Coverage analysis of the script:

Strategy What it checks Verdict
1: --each-feature + ring Each feature in isolation ✅ Catches cfg-gated code broken alone
2: Curated combinations Minimal, JetStream-only, KV, object-store, full stack, chrono variants ✅ Catches interaction bugs
3: aws-lc-rs + fips Alternative crypto backend, FIPS mode ✅ The new fips entry is correct — cargo check runs build scripts but doesn't link, so it works in normal CI
4: Server version flags server_2_10/11/12/14 individually ✅ Correct — no server_2_13 because it was never added to Cargo.toml
5: Experimental Empty marker feature experimental = [] so this just confirms the feature flag name is valid

One minor observation — RUSTFLAGS parity:

The new check_features job in test.yml inherits RUSTFLAGS="-D warnings" from the workflow-level env: block (line 16). The equivalent job in main.yml does not (main.yml only sets CARGO_INCREMENTAL and CARGO_PROFILE_DEV_DEBUG). This means the PR gate is stricter than the post-merge check. Not a problem in practice — the PR check runs first and is the actual gate — but it means a warning that would fail a PR check might slip through post-merge if main is ever pushed to directly.

This isn't introduced by the PR (the old powerset job also lacked RUSTFLAGS), but it's worth a follow-up to add RUSTFLAGS: "-D warnings" to main.yml's env to keep the two workflows consistent.

Stale comment (preexisting, not introduced here): main.yml line 73 says # we only build docs, so we don't have the full build to be cached. — that was copy-pasted from the check_docs job and was already wrong before this PR.

The changes are ready to merge as-is. The RUSTFLAGS parity and stale comment can be a follow-up.

Signed-off-by: Tomasz Pietrek <tomasz@synadia.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 505ea054b7

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment thread nats-server/src/lib.rs

let retry_strategy = tokio_retry::strategy::ExponentialBackoff::from_millis(500).take(3);
let mut stream = tokio_retry::Retry::spawn(retry_strategy, || {
let mut stream = tokio_retry::Retry::start(retry_strategy, || {

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep tokio-retry API within declared version range

This call depends on the just-added Retry::start API, but nats-server/Cargo.toml still declares tokio-retry = "0.3.0", so Cargo may legally resolve 0.3.0 or 0.3.1 from an existing/minimal lockfile. I checked the docs.rs API for those versions and they expose Retry::spawn but not start; start only appears in 0.3.2. In those environments the nats-server test package fails to compile before this retry is ever exercised, so either keep using spawn or raise the dependency lower bound to a version that has start.

Useful? React with 👍 / 👎.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant